21 research outputs found

    Array signal processing algorithms for localization and equalization in complex acoustic channels

    No full text
    The reproduction of realistic soundscapes in consumer electronic applications has been a driving force behind the development of spatial audio signal processing techniques. In order to accurately reproduce or decompose a particular spatial sound field, being able to exploit or estimate the effects of the acoustic environment becomes essential. This requires both an understanding of the source of the complexity in the acoustic channel (the acoustic path between a source and a receiver) and the ability to characterize its spatial attributes. In this thesis, we explore how to exploit or overcome the effects of the acoustic channel for sound source localization and sound field reproduction. The behaviour of a typical acoustic channel can be visualized as a transformation of its free field behaviour, due to scattering and reflections off the measurement apparatus and the surfaces in a room. These spatial effects can be modelled using the solutions to the acoustic wave equation, yet the physical nature of these scatterers typically results in complex behaviour with frequency. The first half of this thesis explores how to exploit this diversity in the frequency domain for sound source localization, a concept that has not been considered previously. We first extract down-converted subband signals from the broadband audio signal, and collate these signals, such that the spatial diversity is retained. A signal model is then developed to exploit the channel's spatial information using a signal subspace approach. We show that this concept can be applied to multi-sensor arrays on complex-shaped rigid bodies as well as the special case of binaural localization. In both cases, an improvement in the closely spaced source resolution is demonstrated over traditional techniques, through simulations and experiments using a KEMAR manikin. The binaural analysis further indicates that the human localization performance in certain spatial regions is limited by the lack of spatial diversity, as suggested in perceptual experiments in the literature. Finally, the possibility of exploiting known inter-subband correlated sources (e.g., speech) for localization in under-determined systems is demonstrated. The second half of this thesis considers reverberation control, where reverberation is modelled as a superposition of sound fields created by a number of spatially distributed sources. We consider the mode/wave-domain description of the sound field, and propose modelling the reverberant modes as linear transformations of the desired sound field modes. This is a novel concept, as we consider each mode transformation to be independent of other modes. This model is then extended to sound field control, and used to derive the compensation signals required at the loudspeakers to equalize the reverberation. We show that estimating the reverberant channel and controlling the sound field now becomes a single adaptive filtering problem in the mode-domain, where the modes can be adapted independently. The performance of the proposed method is compared with existing adaptive and non-adaptive sound field control techniques through simulations. Finally, it is shown that an order of magnitude reduction in the computational complexity can be achieved, while maintaining comparable performance to existing adaptive control techniques.

    A Decoding-Complexity and Rate-Controlled Video-Coding Algorithm for HEVC

    Video playback on mobile consumer electronic (CE) devices is plagued by fluctuations in the network bandwidth and by limitations in processing and energy availability at the individual devices. Seen as a potential solution, the state-of-the-art adaptive streaming mechanisms address the first aspect, yet the efficient control of the decoding-complexity and the energy use when decoding the video remains unaddressed. The end-users’ quality of experience (QoE), however, depends on the capability to adapt the bit streams to both these constraints (i.e., network bandwidth and device’s energy availability). As a solution, this paper proposes an encoding framework that is capable of generating video bit streams with arbitrary bit rates and decoding-complexity levels using a decoding-complexity–rate–distortion model. The proposed algorithm allocates rate and decoding-complexity levels across frames and coding tree units (CTUs) and adaptively derives the CTU-level coding parameters to achieve their imposed targets with minimal distortion. The experimental results reveal that the proposed algorithm can achieve the target bit rate and the decoding-complexity with 0.4% and 1.78% average errors, respectively, for multiple bit rate and decoding-complexity levels. The proposed algorithm also demonstrates a stable frame-wise rate and decoding-complexity control capability when achieving a decoding-complexity reduction of 10.11 (%/dB). The resultant decoding-complexity reduction translates into an overall energy-consumption reduction of up to 10.52 (%/dB) for a 1 dB peak signal-to-noise ratio (PSNR) quality loss compared to the HM 16.0 encoded bit streams.

    Broadband DOA Estimation Using Sensor Arrays on Complex-Shaped Rigid Bodies


    Novel Head Related Transfer Function Model for Sound Source Localisation

    No full text
    Human beings have a remarkable ability to determine the direction of arrival of a sound and to separate sounds of interest. Replicating this ability is a challenging problem in audio signal processing. In this paper we present a model for the head related transfer function (HRTF) developed with the localisation objective in mind. This is achieved by splitting the 3D localisation cues into two functions which can be independently evaluated. We illustrate the theory for calculating these functions and validate the results against actual HRTF data. We find the model to be a close match for a significant number of potential source locations.

    HRTF aided broadband doa estimation using two microphones

    No full text
    Two-sensor broadband direction of arrival (DOA) estimation suffers from an inherent lack of dimensionality due to having just two sensors, yet humans and other animals are able to overcome this limitation using subtle variations introduced by the ears. Application of existing DOA estimation techniques to such systems becomes complicated due to the ill-behaved nature of the Head Related Transfer Function (HRTF). In this paper we present a subband signal extraction and focussing technique which retains the diversity information of the HRTF. We then develop a framework for combining these signals for subspace DOA estimation and investigate the constraints imposed on the single and multi-source DOA estimation problems. Finally, estimation performance is compared with existing techniques, and we find that performance improves to a level comparable to human localisation abilities.

    Active acoustic echo cancellation in spatial soundfield reproduction

    No full text
    The equalization of reverberation effects is essential for spatial soundfield reproduction, but estimation of the reverberant channel presents several challenges to existing equalization techniques. This paper presents a method of active acoustic echo ca

    Robustness analysis of room equalization for soundfield reproduction within a region

    No full text
    Recent works on soundfield reproduction have presented several methods of recreating a desired soundfield within a region. Estimation or prior knowledge of the inverse reverberant channels now becomes an essential element of equalizing the room effects.

    Efficient Multi-Channel Adaptive Room Compensation for Spatial Soundfield Reproduction Using a Modal Decomposition

    No full text
    Mitigating the effects of reverberation is a significant challenge for real-world spatial soundfield reproduction, but the necessity of a large number of reproduction channels increases the complexity and presents several challenges to existing listenin

    QoS, Energy and Cost Efficient Resource Allocation for Cloud-Based Interactive TV Applications

    No full text
    Internet-based social and interactive video applications have become major constituents of the envisaged applications for next-generation multimedia networks. However, inherently dynamic network conditions, together with varying user expectations, pose many challenges for resource allocation mechanisms for such applications. Yet, in addition to addressing these challenges, service providers must also consider how to mitigate their operational costs (e.g., energy costs, equipment costs) while satisfying the end-user quality of service (QoS) expectations. This paper proposes a heuristic solution to the problem, where the energy incurred by the applications, and the monetary costs associated with the service infrastructure, are minimized while simultaneously maximizing the average end-user QoS. We evaluate the performance of the proposed solution in terms of serving probability, i.e., the likelihood of being able to allocate resources to groups of users, the computation time of the resource allocation process, and the adaptability and sensitivity to dynamic network conditions. The proposed method demonstrates improvements in serving probability of up to 27%, in comparison with greedy resource allocation schemes, and a several-orders-of-magnitude reduction in computation time, compared to the linear programming approach, which significantly reduces the service-interrupted user percentage when operating under variable network conditions

    Broadband DOA estimation using sensor arrays on complex-shaped rigid bodies

    No full text
    Sensor arrays mounted on complex-shaped rigid bodies are a common feature in many practical broadband direction of arrival (DOA) estimation applications. The scattering and reflections caused by these rigid bodies introduce complexity and diversity in th